3 research outputs found

    Advantages of using factorisation machines as a statistical modelling technique

    Factorisation machines originated in the machine learning literature and have gained popularity because of the high accuracy obtained in several prediction problems, in particular in the area of recommender systems. This article provides a motivation for the use of factorisation machines, discusses their fundamentals, and gives examples of some applications and of the possible gains from including factorisation machines in the statistician’s model-building toolkit. Data sets and existing software packages are used to illustrate how factorisation machines may be fitted and in what contexts they are worth using.

    Algorithms for estimating the parameters of factorisation machines

    Since their introduction in 2010, factorisation machines have become a popular prediction technique amongst machine learners, who have applied the method with success in several data science challenges such as Kaggle or KDD Cup. Despite these successes, factorisation machines are not often considered as a modelling technique in business, partly because large companies prefer tried and tested software for model implementation. Popular modelling techniques for prediction problems, such as generalised linear models, neural networks, and classification and regression trees, have been implemented in commercial software such as SAS, which is widely used by banks and by insurance, pharmaceutical and telecommunication companies. To popularise the use of factorisation machines in business, we implement algorithms for fitting factorisation machines in SAS. These algorithms minimise two loss functions, namely the weighted sum of squared errors and the weighted sum of absolute deviations, using coordinate descent and nonlinear programming procedures. Using a simulation study, the above-mentioned routines are tested in terms of accuracy and efficiency. The predictive power of factorisation machines is then illustrated by analysing two data sets.
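    The two loss functions named in this abstract can be illustrated with a minimal sketch. It is written in Python with NumPy rather than SAS, assumes the standard second-order factorisation machine model, and is not the authors' implementation. The names `fm_predict`, `weighted_sse` and `weighted_sad` are hypothetical, chosen only for illustration; the fitting step (coordinate descent or nonlinear programming) is not shown.

```python
import numpy as np

def fm_predict(X, w0, w, V):
    """Prediction of a second-order factorisation machine (assumed standard form).

    X  : (n_samples, n_features) design matrix
    w0 : scalar intercept
    w  : (n_features,) linear weights
    V  : (n_features, k) latent factors; interaction weight of features
         i and j is the inner product of rows i and j of V
    """
    linear = w0 + X @ w
    # sum_{i<j} <v_i, v_j> x_i x_j, computed in O(k * n_features) per row via
    # 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ]
    xv = X @ V
    x2v2 = (X ** 2) @ (V ** 2)
    interactions = 0.5 * np.sum(xv ** 2 - x2v2, axis=1)
    return linear + interactions

def weighted_sse(y, y_hat, weights):
    """Weighted sum of squared errors."""
    return float(np.sum(weights * (y - y_hat) ** 2))

def weighted_sad(y, y_hat, weights):
    """Weighted sum of absolute deviations."""
    return float(np.sum(weights * np.abs(y - y_hat)))
```

    A fitting routine would search for the `w0`, `w` and `V` that minimise one of these losses over the training data. The absolute-deviation loss is less sensitive to outlying responses than the squared-error loss.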

    Investigation of factorisation machines and its extensions to predictive models in a statistical context

    PhD (Science with Business Mathematics), North-West University, Potchefstroom Campus
    Factorisation machines originated in the field of machine learning and have gained popularity because of the high accuracy obtained in several prediction problems, particularly in the field of recommender systems. As will be seen in the introductory discussion, statisticians are largely unaware of factorisation machines and their capabilities, and financial companies seldom consider them in their model-building endeavours. On the other hand, generalised linear models are frequently used by financial companies, and given the close relation of factorisation machines to these models, it is rather unfortunate that factorisation machines receive little attention. It is also surprising that factorisation machines are seldom, if ever, mentioned as an alternative technique to benchmark models when problems in the financial services industry are discussed. Other than our papers, we have not been able to find a paper on factorisation machines published in a statistical journal. One of the reasons is probably that machine learners use terminology unfamiliar to statisticians. Almost all of the papers on factorisation machines are published in conference proceedings and in engineering and artificial intelligence journals. Although banks and insurance companies make use of open-source software, their production systems are usually coded in commercial software. Currently, commercial software is rather restricted in the routines that are offered to fit factorisation machines. So, to facilitate the application of factorisation machines to a wide range of problems, a suite of fitting routines is needed. In this thesis, we introduce factorisation machines to the statistical community and show that they can be applied to a wide range of modelling problems.
Specifically, we compare and relate factorisation machines to generalised linear models and we develop several fitting routines in popular commercial software. The accuracy of the routines is thoroughly checked by means of simulation studies. The performance of factorisation machines is then compared to that of popular generalised linear models by using Monte Carlo simulation and several real-world data sets. Specific contributions of the thesis include the introduction of a new robust factorisation machine, based on mean absolute deviation, and logistic factorisation machines that perform well on binary classification problems where the number of events is much smaller than the number of non-events. We further motivate why and in what settings it will be beneficial to include factorisation machines as part of the statistician’s model-building toolkit. In the course of our research on and analysis of factorisation machines (FMs), we concluded that FMs can potentially be used to solve a range of exciting business problems, and we include some of these as recommendations for future research. Lastly, we believe that the routines developed in this thesis will result in the widespread application of factorisation machines in business. We also hope that the publications resulting from this thesis will stimulate interest in this topic among statisticians.
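The close relation to generalised linear models mentioned in this abstract can be made concrete with the standard second-order factorisation machine. The notation below is an assumption for illustration, not taken from the thesis: $g$ is a link function, $x_1,\dots,x_p$ are the covariates, and $\mathbf{v}_i \in \mathbb{R}^k$ is the latent factor vector of covariate $i$.

```latex
% Linear predictor of a generalised linear model (main effects only)
g\bigl(\mathrm{E}[y \mid \mathbf{x}]\bigr) = w_0 + \sum_{i=1}^{p} w_i x_i

% Second-order factorisation machine: the same predictor plus all pairwise
% interactions, whose coefficients are factorised as inner products
g\bigl(\mathrm{E}[y \mid \mathbf{x}]\bigr)
  = w_0 + \sum_{i=1}^{p} w_i x_i
  + \sum_{i=1}^{p} \sum_{j=i+1}^{p} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j,
\qquad
\langle \mathbf{v}_i, \mathbf{v}_j \rangle = \sum_{f=1}^{k} v_{if} v_{jf}
```

Setting every $\mathbf{v}_i$ to zero recovers the main-effects model. With a logit link, the second form corresponds to a logistic factorisation machine in the usual sense of the term.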